fix: correctly call predict for OLS in CUPAC#12

Open
mnicky wants to merge 1 commit into kolmogorov-lab:main from mnicky:patch-1

mnicky commented Jun 5, 2024

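With statsmodels OLS, predict() should be called on the fitted results object returned by fit(), not on the unfitted model. The model's own predict(params, exog=None) takes the parameter vector as its first argument, so model.predict(x_predict) treats x_predict as params and multiplies it with the training exog.

At least in my environment, the original version fails with the following exception: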

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[59], line 1
----> 1 ab_test_cupac = ABTest(abtest, ab_params).cupac()

File /opt/conda/envs/py311/lib/python3.11/site-packages/abacus/auto_ab/abtest.py:668, in ABTest.cupac(self)
    666 self.__check_required_metric_type("cupac")
    667 self.__check_required_columns(self.__dataset, "cupac")
--> 668 result_df = VarianceReduction.cupac(
    669     self.__dataset,
    670     target_prev_col=self.params.data_params.target_prev,
    671     target_now_col=self.params.data_params.target,
    672     factors_prev_cols=self.params.data_params.predictors_prev,
    673     factors_now_cols=self.params.data_params.predictors_now,
    674     groups_col=self.params.data_params.group_col,
    675 )
    677 params_new = copy.deepcopy(self.params)
    678 params_new.data_params.control = self.__get_group(
    679     self.params.data_params.control_name, result_df
    680 )

File /opt/conda/envs/py311/lib/python3.11/site-packages/abacus/auto_ab/variance_reduction.py:98, in VarianceReduction.cupac(cls, x, target_prev_col, target_now_col, factors_prev_cols, factors_now_cols, groups_col)
     79 """Perform CUPED on target variable with covariate calculated
     80 as a prediction from a linear regression model.
     81 
   (...)
     93     pandas.DataFrame: Pandas DataFrame with additional columns: target_pred and target_now_cuped
     94 """
     95 x = cls._target_encoding(
     96     x, list(set(factors_prev_cols + factors_now_cols)), target_prev_col
     97 )
---> 98 x.loc[:, "target_pred"] = cls._predict_target(
     99     x, target_prev_col, factors_prev_cols, factors_now_cols
    100 )
    101 x_new = cls.cuped(x, target_now_col, groups_col, "target_pred")
    102 return x_new

Cell In[58], line 30, in predict_target(x, target_prev_col, factors_prev_cols, factors_now_cols)
     27 print(results.summary())
     28 x_predict = x[factors_now_cols]
---> 30 return model.predict(x_predict)

File /opt/conda/envs/py311/lib/python3.11/site-packages/statsmodels/regression/linear_model.py:411, in RegressionModel.predict(self, params, exog)
    408 if exog is None:
    409     exog = self.exog
--> 411 return np.dot(exog, params)

File <__array_function__ internals>:200, in dot(*args, **kwargs)

ValueError: shapes (2400372,5) and (2400372,5) not aligned: 5 (dim 1) != 2400372 (dim 0)
